<!--<center><!--<center><img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DS0101EN-SkillsNetwork/labs/Module%201/images/SN_web_lightmode.png" width="300"></center><br/>-->

## IBM Cloud Gallery

Estimated Time (45 min)

IBM Cloud Resource hub is a growing collection of data sets, notebooks, and project templates. In this lab, you will use *IBM Cloud Resource hub* to explore different datasets. As you learned in the course, data can be more than just numbers: it can be numeric, text, images, video, audio and more. You will look at three samples.

**Sample 1** contains data with only numeric attributes.

**Sample 2** contains data with numeric and text attributes.

**Sample 3** contains a Jupyter Notebook, a tool that data scientists use to create models.

Let's take a look at how data scientists use different datasets.

#### Objectives

You will learn to:

* Explore the IBM Cloud Resource hub
* Examine a numeric dataset
* Examine a dataset with non-numeric attributes
* Examine a Jupyter Notebook

### Exercise 1: Examine a numeric dataset

1. Click on the link: https://dataplatform.cloud.ibm.com/gallery
2. Click the filter button in the top right of the window:

<img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/i3wHdL5V2zxB_s1iS847_w/filterbutton.jpg" width="600" height="200" style="border: solid 1px grey; margin-top: 30px; margin-left: 30px; margin-bottom: 30px;">

3. In the dropdown menu that appears, select the _Data_ checkbox under _Sample type_. Then click on the _Tags_ dropdown, and select the _Environment_ checkbox.
4. In the search results, click on _UCI: Forest Fires_.
5. Preview the data using the _Preview_ option.

##### Explore the data

The data describes forest fires in the northeast region of Portugal. The aim is to predict the burned area of a fire from meteorological and other data.

**Attribute Information:**

1. X - x-axis spatial coordinate within the Montesinho park map: 1 to 9
2. Y - y-axis spatial coordinate within the Montesinho park map: 2 to 9
3. month - month of the year: 'jan' to 'dec'
4. day - day of the week: 'mon' to 'sun'
5. FFMC - FFMC index from the FWI system: 18.7 to 96.20
6. DMC - DMC index from the FWI system: 1.1 to 291.3
7. DC - DC index from the FWI system: 7.9 to 860.6
8. ISI - ISI index from the FWI system: 0.0 to 56.10
9. temp - temperature in Celsius degrees: 2.2 to 33.30
10. RH - relative humidity in %: 15.0 to 100
11. wind - wind speed in km/h: 0.40 to 9.40
12. rain - outside rain in mm/m2: 0.0 to 6.4
13. area - the burned area of the forest (in ha): 0.00 to 1090.84 (this output variable is very skewed towards 0.0, so it may make sense to model it with a logarithm transform).

### Exercise 2: Evaluate a non-numeric dataset

Data doesn't have to be based only on numbers. It can be text, images and other types as well. Let's look at a dataset that has text values.

1. At the top of the page, select the _Resource hub_ option.

<img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/N5dwDitUC-AT1VqyON-IxQ/Gallery.jpg" width="500" height="300" style="border: solid 1px grey; margin-top: 30px; margin-left: 30px; margin-bottom: 30px;">

2. Type _Airbnb_ into the search bar.

<img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/Do_fUa3gOdBtl1UUGKrcbA/searchairbnb.jpg" width="500" height="200" style="border: solid 1px grey; margin-top: 30px; margin-left: 30px; margin-bottom: 30px;">

3. Select the _Airbnb Data for Analytics: Trentino Reviews_ option. You may need to scroll to find it.
4. Preview the data using the _Preview_ option.

<img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DS0101EN-SkillsNetwork/labs/Module%202/images/Airbnb_preview.png" style="border: solid 1px grey;margin-top: 30px; margin-left: 30px; margin-bottom: 30px">

##### Explore the data

Airbnb, Inc.
is an American company that operates an online marketplace for lodging, primarily homestays for vacation rentals, and tourism activities. Airbnb guests may leave a review after their stay, and these reviews can be used as an indicator of Airbnb activity. The minimum stay, price and number of reviews have been used to estimate the occupancy rate, the number of nights per year and the income per month for each listing.

You could use this data in a multitude of ways: to analyze the star ratings of places, the location preferences of customers, the tone and sentiment of customer reviews, and more. Airbnb uses location data to improve guest satisfaction.

>💡 What else might you use this data for?

The dataset comprises three main tables:

- listings - Detailed listings data showing 96 attributes for each of the listings. Some of the attributes used in the analysis are price (continuous), longitude (continuous), latitude (continuous), listing_type (categorical), is_superhost (categorical), neighbourhood (categorical) and ratings (continuous), among others.
- reviews - Detailed reviews given by the guests, with 6 attributes. Key attributes include date (datetime), listing_id (discrete), reviewer_id (discrete) and comment (textual).
- calendar - Booking details for the next year, by listing. Four attributes in total: listing_id (discrete), date (datetime), available (categorical) and price (continuous).

### Exercise 3: Evaluate Jupyter Notebook

1. Return to the Resource hub.
2. Click the filter button and select _Notebook_ from the _Sample type_ menu that appears.
3. In the search bar, type _Finding optimal locations_.
4. Select the card that says _Finding optimal locations of new stores using..._

This Jupyter notebook uses _Decision Optimization_ with Python to help determine the optimal location of a new store.
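
To make "optimal location" concrete, here is a minimal sketch of the underlying idea. It is not the notebook's code and does not use Decision Optimization: the coordinates, site names and helper functions below are all invented for illustration. It tries each candidate site and keeps the one with the smallest total straight-line distance to a set of points.

```python
import math

# Hypothetical (x, y) positions on a flat grid; these are NOT the notebook's data.
libraries = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0)]

# Hypothetical candidate sites for a new store.
candidates = {"north": (4.0, 3.0), "center": (2.0, 1.0), "west": (0.0, 0.0)}


def total_distance(site, points):
    """Sum of straight-line distances from one site to every point."""
    return sum(math.dist(site, p) for p in points)


def best_site(sites, points):
    """Return the name of the candidate site with the smallest total distance."""
    return min(sites, key=lambda name: total_distance(sites[name], points))


# Print the total distance for each candidate, then the winner.
for name, site in candidates.items():
    print(f"{name}: {total_distance(site, libraries):.2f}")
print("best:", best_site(candidates, libraries))
```

Trying every candidate only works for a handful of sites. A real optimization model, like the one in the notebook, is needed once there are many candidates, several stores to place, or extra constraints.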

The notebook aims to find the coffee shop location that minimizes the total distance from the libraries in the area to the shop, so that a book reader can get to the shop easily.

<img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DS0101EN-SkillsNetwork/labs/Module%202/images/Notebook.png" style="border: solid 1px grey;margin-top: 30px; margin-left: 30px; margin-bottom: 30px">

Part of the Python code in the notebook displays the locations of the libraries on a map.

<img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DS0101EN-SkillsNetwork/labs/Module%202/images/loc_jupyter.png" style="border: solid 1px grey;margin-top: 30px; margin-left: 30px; margin-bottom: 30px">

Looking at the map alone, however, you cannot determine the ideal location for the coffee shops.

The code then builds an optimization model that proposes locations for the coffee shops, with the stipulation of minimizing the distance between the libraries and the shops.

<img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DS0101EN-SkillsNetwork/labs/Module%202/images/loc2_jupyter.png" style="border: solid 1px grey;margin-top: 30px; margin-left: 30px; margin-bottom: 30px">

#### Summary

In this lab, you have learned how to explore datasets and notebooks in the IBM Cloud Resource hub.

# Author(s)

<h4><a href = "https://www.linkedin.com/in/malika-goyal-04798622/">Malika Singla</a> </h4>

# Other Contributor(s)

<h4><a href = "https://www.linkedin.com/in/lavanya-sunderarajan-199a445/">Lavanya</a> </h4>

<!--
## Change log

| Date | Version | Changed by | Change Description |
|------|--------|--------|---------|
| 2024-05-02 | 1.3 | Sathya Priya | Updated Screenshots and instructions|
| 2023-10-09 | 1.3 | Bethany Hudnutt | Clarified Language and updated images|
| 2022-10-27 | 1.3 | Lakshmi Holla| Updated Instructions|
| 2022-07-22 | 1.2 | Appalabhaktula Hema| Updated Screenshots and instructions|
| 2022-02-16 | 1.1 | Niveditha | Updated watson Screenshot |
| 2021-06-010 | 1.0 | Malika Singla | Initial Version |
-->

<footer><img align="left" src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/nRmYgyM2KjRIIiG16R7ikg/ibmsn-footer-blue.png" /></footer><br/>